The arguments for evaluation as a service versus evaluation as an
accountability function are debated. Proponents of the evaluation as
service approach often argue that information requested for self-renewal
is the information most likely to be used, while those who favor
accountability as the appropriate role for evaluation claim that people
will seek out evaluation of themselves only if it is not dangerous to
them personally. These points and others are argued point and
counterpoint; a resolution between the two views is sought. (Author)
Models for the Delivery of School District Evaluation:
Service or Accountability?



Fuzzy thinking underlies many of the organizational schemas for
evaluations in public schools today: fuzzy thinking about the
purposes of evaluation and the underlying philosophies that govern
the organizational structure of evaluation units to achieve those
purposes. The way a school district organizes to do evaluation
should directly reflect both its purpose in evaluating its programs
and its philosophies of administration and educational change.
Evaluators themselves should be keenly attuned to these local
ideas, either directing their work to the existing
purposes and philosophies or working to change them.

The Austin ISD Office of Research and Evaluation was created three
years ago in a district where the Board and public were clamoring
for accountability. In reexamining our actions over the past
three years as we have struggled to institutionalize this new unit,
we believe we have been guilty of the fuzzy thinking alluded to above.
We tried to serve up a "service" unit when the dish our paying clients
were ordering was an "accountability" one. We found that
we were unclear three years ago about who our real clients were and
hence were misaddressing much of our work and reporting. Moreover,
we think we were deluding ourselves with some romantic notions about
human behavior change that the data simply will not support.
In the last three years, we have laid out two dichotomous models
of "evaluation for accountability" and "evaluation as a service function."
By examining these two models, we have come to better understand our
organizational role. This self-examination has had concomitant effects
on the focus of our work, the establishing of priorities, the external
and internal organization of our unit, and particularly on our reporting
formats and styles. Most of all it helped our evaluation unit make an
impact on educational practice in our district.

This paper first presents some concepts that underpin the two models
to be described. It then lays out the two models. These models are
described first as two extremes, probably neither of which exists in
pure form in any school district. A third model, a compromise of these two
extremes, which can result from an analysis of these first two models, is
then portrayed.



The Client for Evaluation

Client is a crucial term. Clients may be one of two types:
client-purchaser or client-recipient. The distinguishing feature between
these two types of clients is that the client-purchaser has direct power
to purchase while the client-recipient may receive goods or services only
if he influences those who have purchase-power to procure the services for
him. Using this definition, the client-purchaser will be the administration
(superintendent, board, and public) while the client-recipient will be the
school personnel who might use the evaluation service for their purposes.

Evaluation Purpose 

We define the purpose of evaluation as the use or potential use to
which evaluation may be addressed. Our office identifies the ultimate
purpose of evaluation as "the improvement of student learning outcomes."
All clients, whether paying or using, will agree theoretically with this
formulation. Intermediate purposes, however, may not be so clear. While
it is true that the public, school board, and superintendent may well
have political evaluation purposes in mind, the school district staff
may be even more "politically" motivated, because they have at stake the
very personal purpose of professional advancement or job retention. Thus,
evaluators are always faced with a political reality in which negative
evaluation information is rarely well received. Glass (1975) has made us
aware of the paradoxical climate in which evaluation information is
generated and used.

Despite the problems associated with defining the purpose of evaluation,
the wise evaluator should understand the environment in which he operates
and the potential purposes for evaluation with both kinds of clients
identified.

Philosophy and Evaluation

Throughout the history of man, two conflicting views of human nature
have operated. In one view, man is viewed as innately good. As such,
he will [illegible] behave rationally and kindly toward his fellow man,
unless society (viewed as an evil corrupter) influences him to behave
wrongfully. In the alternative view, man is seen as born with "original
sin." Here, man is viewed as requiring exorcism of evil through [illegible],
training, or other continued vigilance. More modern philosophers may
phrase this as innate self-centeredness. In education the basically
good vision of man is expressed in the philosophies of Rousseau and
such contemporary romanticists as Holt, Kohl, and Illich. The evil view
of man is exemplified in the "world view" of Newton (Mink) and more
recently the "fundamental school" and "basic skills" citizen groups.
Although the relationship may appear oblique, these philosophical
dichotomies can be seen underlying the two models of evaluation as
accountability or service.

The clients for evaluation here are clearly client-purchasers:
the top level administrators, board, and thus indirectly the broader
public. Their need of an evaluation unit is for a brief, reliable
report on whether the program worked or didn't work in terms of products
or student outcomes in order to make decisions. They will be particularly
interested in cost information. Not only will they be interested in
total program costs, but if there were achievement gains they will wish
to know at what relative costs they were gained. They will need enough
written backup data to be sure of the evaluation unit's competence, but
they will not wish to hear this data in detail. If there is any process
information of direct importance to outcomes, such as a major failure in
the implementation of the program, they may be interested in knowing this.
But they will not want to hear about the fine points of the process
evaluation.

The philosophy underlying this model is that people act in response
to directions from an authority and usually must be externally motivated
to change their performance. Contrary to much of the recent human
resources and systems analysis theorizing, this model postulates that in
the complex world of human behavior, simple feedback on the state of the
system will not be sufficient to change behavior. The immediate human
costs for change are frequently so great that external intervention and
orders from above are necessary.
Organizationally, the essential that must be observed if the
program accountability model is elected is that evaluation should be
an independent organizational unit reporting at a very high level of the
hierarchy. This is necessary, because credibility is the most crucial
commodity an evaluator has in this environment. The evaluator's clients
will not have time to read his massive technical reports nor understand
his fine statistics, but they must somehow be assured of his competence
and integrity. Moreover, they must have confidence that his work will not be
filtered of negative information. Thus, in the diagram of this model in
Figure 1 the evaluation unit is hierarchically above the program unit.
This position of the unit will also be crucial, because the total system is
likely to become the natural enemy (overt or covert) of the unit.
If this accountability model of evaluation were strictly implemented,
it would be the least expensive form of evaluation in monetary terms. Since
communication and coordination with program or school staff and process
evaluation tend to be the most time consuming and hence most expensive
elements of evaluation, the reduction in importance of the downward flow
of information in this model can reduce evaluation costs.

In terms of staff support, however, the approach strictly implemented
will be the most "expensive." Staff resentment over the evaluator (as
auditor) will gradually lead to distrust. This distrust in turn will
interfere with data access and/or data reliability.

2. Service Model

The client in this case is the client-recipient (usually the program
staff), and evaluation assumes a service role to that staff. In the
service model's purest form, the evaluation staff might provide to the
staff only technical assistance such as statistical analyses or data
processing. The evaluation staff might even report to the program staff
administratively. This organizational arrangement is illustrated in
Figure 2. The two-way directionality of information flow in this model
as opposed to the previous diagram is the most important feature.

INSERT FIGURE 2

This model has its philosophical roots in the old Rousseauian view
of human nature: it is believed that man will opt for "good" under his
own motivation, in this case, educational improvement. The underlying
postulate is that people are always motivated to perform at their
maximum level; evaluation serves only to provide accurate information
about the state of the program or system. It is implied that no external
motivation or threat is needed to provide change.

The evaluation staff following this model would be quickly attuned
to in-course program changes and could pick up on process differences as
they occur. Ideally, evaluation services would be requested at appropriate
times, and the data would be immediately used by the program staff.

Several weaknesses are immediately obvious in this model. Most
importantly, negative information is likely to be filtered out before
reaching major decision-makers. This may lead to the retention of bad or
weak programs. By the same token, the evaluation design may be at the
mercy of the program staff. The clients may not request or approve
appropriate evaluation resources, which will then lead to inaccurate or
invalid data or its interpretation. The program staff is also unlikely
to place a high priority on the long-term pay-offs of evaluation
compared to immediate budget needs of the program. Hence, the allocation
of sufficient resources to evaluation to provide adequate process and
outcome evaluation will be unlikely. Inadequate evaluations can then lead
to evaluation receiving even lower resource priority, and the downward
spiral continues. Perhaps the most serious problem for this model for
public school evaluators is that the information client here is not the
money-paying client who must eventually support the unit with tax dollars.

THESE TWO MODELS AND REALITY (A THIRD MODEL)

Extensive literature on evaluation theory has appeared addressing
such questions as the function and purposes of evaluation (Stufflebeam
et al., 1970; Provus, 1971; Stake, 1967) and the methodology of
evaluation (Popham, 1974; Borich, 1974; Anderson et al., 1975). Many of
these earlier works have tangentially dealt with the internal organization
of evaluation units, but none have really made explicit the alternate
organizational ways in which evaluation services might be delivered within
a local school district.

We believe these models of evaluation as service and as accountability
may have more practical implications for the organizational realities faced
by public school evaluation staffs than other extant theoretical work.
Regardless of how well or how thoroughly evaluations are conducted, the
translation of those evaluations into program actions is often more dependent
upon the organizational role which evaluation plays than upon the study
itself. We have seen numerous evaluations dropped into the great chasm of
public school bureaucracy never to be heard from again because the evaluators
lacked the organizational voice to have them heard.

The competing concepts of human nature and behavior change which are
implicit in the dichotomous models described above are not likely to be
totally resolved anywhere and certainly not in the political context in
which public school evaluation operates today. Thus, one is unlikely to
see any school district evaluation unit organization that matches the
models described. Indeed, most units will try to achieve a working blend
of the two models just as we have over our three years of operation.

In the 1973-74 school year our unit began under an ESEA Title III
grant to provide a model evaluation capability in our district. Again
with fuzzy thinking predominant in our first year's work, we hoped to
achieve accountability through providing information as a service to
programs on the achievement of their objectives. We ranked low in the
school district hierarchy that first year, reporting to a director who
reported to an assistant superintendent who reported to a superintendent.

Fortunately, we were established organizationally independent
from the programs we evaluated, in direct opposition to some local
administrative thinking. At the end of that year we communicated to the
program staffs on how well they had done with their objectives. We
managed to get a school board review of those reports, but failed to secure
any kind of staff or administrative commitment to program change. Not
surprisingly, little program change occurred. The next year's funding
from the district for evaluation was about the same: no substantial change.

By whatever yardsticks we could use to measure the effect of our
evaluations on district programs that year, we had to rate our unit as
a failure. At least, however, we were beginning to recognize what we
should be doing.

The second year we began to take a look at our own work and
to analyze our operation and our reports. Thanks also to some fortunate
organizational changes that occurred, we reported that second year to a
[illegible] deputy superintendent charged with overall responsibility
for instruction and development. We decided that the pure "service"
model, per se, was not a viable model and opted for a move toward the
accountability mode. In reporting, this meant we emphasized not
objectives but decision questions in our reports. An example is indicated in
Figure 3.



We forgot our research "if and maybe" conclusions and opted for firm,
best-indicated recommendations. Moreover, we began to talk face to
face to board members and top administrators and to recognize that reports
were less important than our personal availability when decisions were
about to occur.

In this our third year, we have directly reported to the superintendent
and recognize once again the greater impact of evaluation findings on
action when input is given directly to the top administrative levels. We
believe that our move toward an accountability model has demonstrated
greater payoff for changing educational programs. Our current mode of
operation might be described as in Figure 4.
Nonetheless, we cannot say that a pure accountability model is
viable either. "Service," after all, buys access to information, and any
evaluation unit that hopes to continue functioning has to yield enough
service to keep its data channels open.

Thus, we think we have come up with a blend, the "Accountability-
Service Model," in which we acknowledge the true role of the
"service" we can render. We do not believe that feedback of evaluation
data alone will bring about change in people or in education.
Answers to these questions will assist the Board of Trustees and
[illegible] of the program.

1. Should the Individually Guided Education (IGE) Program be
   continued at its present level, expanded, or discontinued?
2. Should the IGE Program be implemented only in schools with
   student groups having certain identifiable characteristics?
3. Are additional resource inputs advisable if the decisions on the
   first and second questions are positive?
4. Are there any particular characteristics of IGE or individualized
   instruction whose implementation should be encouraged in AISD
   elementary schools?

B. Program-Level Decision Questions

Answers to the following questions will assist those charged with
implementing the program in their decision making.

1. Should additional training be provided?
2. If additional training is required, should it be of a particular
   type?

C. School and Classroom Level Questions

Answers to the questions below will assist those charged with making
decisions at the school and classroom level, e.g., principals and
teachers.

1. Should adaptations or changes be made in the model processes of
   IGE as implementation proceeds?
2. Is additional resource help needed for the IGE implementation?
3. Do particular staff members on units need additional training
   or assistance in the implementation of the IGE program?
4. Are there certain types of students who may need to be given
   particular attention in the IGE classroom?
Both teachers reportedly learned from this trade-off. One remarked that
she had not realized a certain group of students never participated in
her class until she sat at the back and watched them being taught by
someone else.

The interns also reported benefits from observing. Under normal
circumstances, interns rarely have the opportunity to see anyone teach a
class other than their master teacher. As one intern put it, "By seeing
other interns you get to see yourself with regard to your peer group; it
is reassuring to know that you are not the only one making mistakes."

All participants liked the exposure to other methods of instruction
and teaching styles. Teachers rarely have a chance to observe one another
teaching, particularly if they are in self-contained classrooms. But even
the teachers in the open-space school said that under usual conditions,
they were too busy to observe their teammate adequately. Collegial
evaluation gave them the chance not only to observe but to focus their
observation using specific criteria.

The quality of feedback exchanged in the conferences was largely
dependent on the quality of observations. The best observers were those
guided by a few specific criteria that were appropriate to the particular
activity they observed. They learned more from their observations and
were able to offer their partner concrete and useful information.

Conferences 

Conferences require the ability to give constructive criticism with- 
out damaging egos or destroying long-term relationships. As our collegial 
evaluation program specifies, teachers in the pilot test exchanged feed- 
back on three occasions: after each of the observation periods and at 
the wrap-up conference. In addition, they rated their strengths or 
weaknesses for each of the shared criteria on the self-evaluation form, 
which is similar to the observation form, making it easy to compare the 
two evaluations. In every case, participants were harder on themselves 
than their colleagues were. 
The interns were much more willing than the elementary teachers to
give low ratings to their colleagues and to give critical feedback on the
observation form. Interns, by definition, are "people learning the skills
of teaching," while certificated teachers (theoretically at least) already
possess these skills. From this perspective, it is not surprising that
interns were more comfortable offering written criticism than the
elementary school teachers. During the conferences, however, teachers
exchanged criticism and did more than pat one another on the back. Although
they were reluctant to write down their negative comments, they were usually
quite candid in their conferences.

An important purpose of the conferences is to develop specific strategies
for improvement. Since the elementary school teachers worked together in
the same classroom area, many of them identified problems that could be
worked on cooperatively. For example, one pair agreed that the noise level
in their area was occasionally too high and they discussed how, as members
of a team, they could create a quieter learning atmosphere. Because these
teachers worked together, they were motivated to help each other, to give
feedback that would improve not only their individual teaching performance
but the overall atmosphere of their classroom.

One teacher pointed out that a major difference between criticism during
collegial evaluation and evaluations by an administrator was "the way
criticism was phrased." We were continually impressed by the tact and
diplomacy exhibited in the conferences. Criticisms were frequently presented
as suggestions for alternative techniques. In one teacher's words, "Instead
of having someone say, 'you should do this', a colleague was more likely
to say, 'something that worked well for me was this technique.'" This
approach not only was less threatening but was perceived as more legitimate.
If the technique worked for a colleague, it was worth a try.

The interns' conferences emphasized diagnosis rather than specific 
recommendations. They spent more time and effort analyzing teaching strengths 
and weaknesses than the elementary school teachers did. Perhaps because 
of their relative inexperience, they did not have as many concrete
suggestions to offer one another and instead devoted some time at each
conference to brainstorming alternative teaching strategies.
Collegial evaluation provided positive reinforcement as well as
constructive criticism. Suggestions for improvement were balanced with praise
for effective teaching. Praise seemed to fill a very great need. As one
teacher said, "When your colleague praises you, it means much." Praise
improves teaching by reinforcing successful practices, thus encouraging
their frequent use. In school, teachers rarely receive praise from their
colleagues because they are not observed or evaluated by them. Though the
value of positive reinforcement in motivating pupils is universally
recognized, this practice has seldom been extended to teachers, in spite of
the fact that the importance of teachers' job satisfaction and faculty
morale has long been recognized by teachers and administrators alike.

The feedback given in the conferences encompassed virtually every aspect
of classroom activity. Teachers learned not only about their own
performance but about the overall climate of their classroom. For example, one
intern noted, "There was a warm, cooperative atmosphere in this classroom.
It was created by allowing student work groups to sit together on pillows
on the floor and emphasizing the importance of group evaluation for the
task." Another intern summarized his feeling for a class by telling his
partner, "People are noisy; that doesn't bother me. They are talking,
getting excited, and having fun." On a more critical note, an art intern
told his partner that clean-up period was "utter chaos" and suggested that
students be assigned responsibilities for cleaning up after themselves.

Teachers also reported learning more about the behavior of particular 
students. One observer said of a self-directed project, "The autonomous 
kids go directly to work, but those who need a lot of teacher direction 
and support are left out." During a classroom discussion session, another 
observer noted, "While most students seem to be involved, a few appear to
be untouched by the discussion." And during a lecture presentation another 
observer said, "A couple of students did not understand; they needed ex- 
tensive clarification." These comments became catalysts for discussion in 
the conference. The observed teacher wanted to know which students were 
not autonomous, which were untouched by the discussion, and which needed 
further clarification. The partners then discussed ways to overcome these 
problems. 
Some of the observations focused on problems of classroom discipline.
Classroom control was more frequently discussed in conferences by interns
than by teachers. Throughout the evaluation process, interns helped one
another identify which students were creating problems and what might be
done to improve classroom order. For example, one intern learned that
"a small group of boys in the back are goofing off." Following the
conference this small group was broken up and dispersed throughout the
classroom.

After specific discipline problems had been openly discussed in the 
conferences, both interns and teachers often took steps to solve them. 
Overlooking a particularly noisy student is difficult when a colleague has 
identified the problem through systematic evaluation and provided a just- 
ification for action. For example, many interns reported a reluctance to 
openly chastise their students. They feared that any display of authority 
would squash independence or creativity, or perhaps more important that it 
would jeopardize their students' affection for them. But when a colleague 
says that a certain student is testing the limits of tolerance (and what's 
more, that the same student creates a similar problem in his or her own 
classroom), a teacher feels more justified in trying to find sound teaching 
techniques to bring that student into line. 

Understandably, much of the feedback exchanged during conferences 
focused on the teacher's behavior in the classroom. Some discussions were 
directed at subject-matter presentation. Teachers gave each other useful 
information about the quality of materials used in lessons, the appropriate- 
ness of the language used in classroom presentations, the clarity of object- 
ives and direction, and specific techniques for making their lessons more 
interesting. These comments ranged from general observations, such as
"The material is going over the kids' heads," to more specific ones, such
as "Your explanation of chromatic half steps was a little complicated."
Similarly, the suggestions for improvement ranged from general ones
concerning the teacher's overall performance, such as "You should take at
least a half hour to present material you are now covering in ten minutes,"
to very specific ones, such as "Why not give each student a copy of the
keyboard to follow along during your explanation of chromatic half steps?"
The conferences also provided a forum for discussing teacher-student
interaction, which was a matter of great concern to the participants,
judging by both the criteria they chose for observing and the feedback they
gave during conferences. A common observation was that a certain student
or group of students was ignored. Many teachers wanted feedback concerning
whether they used eye contact with everyone in their room, whether they
called on different pupils rather than continually selecting the same ones,
and whether they gave equal attention to students. One teacher learned
that though she was successful in finding occasions to talk with all of
her students individually about their art projects, most of her remarks
were negative. In the conference her partner suggested that "students should
get more reinforcement on the positive aspects of their work." Teachers
continually praised one another for using positive reinforcement. As one
said, "You gave lots of 'warm fuzzies' this morning and it meant a lot to
the kids."

On a more procedural note, participants found that holding conferences 
no more than two or three days after observations ±mproved the quality of 
feedback. Similarly, the observation form (where ratings and comments on 
the colleague's performance are written) was more useful if it was completed 
immediately after observing. But most important, teachers reported that 
the quality of their conferences ultimately depended on the willingness of 
the partners to be reasonably honest with one another. 



Teachers rarely told one another to be more critical of their students'
work or to develop higher expectations for their students, either individ- 
ually or as a class. They seemed to believe that each student should 
receive a lot of teacher warmth and approval regardless of his academic 
performance. We believe that this approach has serious flaws. Other 
research shows that students develop greatly inflated opinions of their 
academic skills in classrooms characterized by strong and uncritical 
teacher approval. Overstressing warmth and praise may have negative con- 
sequences, since it can lead students to have totally unwarranted beliefs 
about their academic skills. G.C. Massey, M.V. Scott, and S.M. Dornbusch,
Racism without Racists: Institutional Racism in Urban Schools, Occasional
Paper No. 8 (Stanford, CA: Stanford Center for Research and Development
in Teaching, 1975), pp. 7-10. Reprinted from The Black Scholar, 7, No. 3
(November 1975), pp. 10-19.
Self-Assessment and Student Questionnaire

Following the structure of our collegial evaluation program, several 
of those who participated in the pilot test distributed the student 
questionnaire to their classes and completed the self-assessment form as
part of the evaluation process. The teacher questionnaire contains items
parallel to the student questionnaire. These allow teachers to identify
similarities and differences in their perceptions of themselves and their
students' perceptions. For example, the teacher responds to the question, 
"How often do you encourage students to ask questions when they don't 
understand what's going on?" Students answer the similar question,
"When you don't understand what's going on in this class, how often are
you encouraged to ask questions?" Like the teacher, students use a
five-point scale which ranges (for this question) from "always" to "never."
After combining the student responses and computing a classroom average, 
the teacher can discover the level of agreement between his self- 
assessment and his students' assessment. Moreover, by looking at the dis- 
tribution of responses, a teacher might find that some students "never" 
feel encouraged to ask questions, even though most students "usually" do* 
Both the classroom average and the distribution thus provide interesting 
and useful kinds of information. 
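
As a rough illustration of the tabulation just described, the following
Python sketch computes a classroom average and a response distribution for
a single item and compares the average with the teacher's self-rating. The
numeric coding (5 = "always"), the middle scale label "sometimes," and the
sample responses are assumptions made for illustration; they are not taken
from the questionnaire itself.

```python
from collections import Counter

# Assumed coding of the five-point scale. "always", "usually", "seldom", and
# "never" appear in the report; "sometimes" is a hypothetical middle label.
SCALE = {"always": 5, "usually": 4, "sometimes": 3, "seldom": 2, "never": 1}


def summarize_item(responses):
    """Return (classroom average, distribution) for one questionnaire item."""
    scores = [SCALE[r] for r in responses]
    average = sum(scores) / len(scores)
    counts = Counter(responses)
    # Report every scale point, including those nobody chose.
    distribution = {label: counts.get(label, 0) for label in SCALE}
    return average, distribution


# Hypothetical class of ten students answering one item.
student_responses = ["usually"] * 6 + ["always"] * 2 + ["never"] * 2
teacher_self_rating = SCALE["usually"]

average, distribution = summarize_item(student_responses)
gap = teacher_self_rating - average

print(f"classroom average: {average:.1f}")
print(f"teacher minus class: {gap:+.1f}")
print(distribution)
```

In this made-up class the average sits close to the teacher's own rating,
yet the distribution still shows two students who answered "never," which
is the kind of detail an average alone would hide.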

The contribution of these questionnaires to the evaluation process 
was summarized by one teacher: 

I believe that the student questionnaire was extremely 
valuable in providing information that I myself or a third person 
could not possibly provide adequately or accurately. The specific 
kinds of questions deal with those problems that cannot be readily 
observed. They focus on those students' personal and academic needs 
that are basic to learning. 

One of the most striking results of the pilot test was the high level 
of agreement between teachers and students as shown by responses on their 
questionnaires. This similarity was not anticipated by the teachers. 
One teacher remarked, "I was very surprised to find that my own percep-
tions agreed fourteen out of twenty-one times (over 66%) with the average
of the students. I think this proved that even though my class may not 
be the greatest one in the world, my students and I certainly agree on 
what it is." Another teacher said, "The questionnaires indicate that I 
have a realistic understanding of my students' feelings toward the class
and myself as a teacher."
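
The first teacher's figure of fourteen agreements out of twenty-one items
comes to about 66.7 percent. The report does not state how "agreement" on
an item was defined. The sketch below assumes, purely for illustration,
that an item counts as agreement when the teacher's rating equals the
classroom average rounded to the nearest scale point; the ratings shown are
hypothetical.

```python
def agreement_rate(teacher_ratings, class_averages):
    """Count items where the teacher's rating equals the rounded class average.

    Assumed definition of agreement; the report does not specify one.
    Note that Python's round() sends exact .5 values to the nearest even integer.
    """
    if len(teacher_ratings) != len(class_averages):
        raise ValueError("need one class average per teacher rating")
    matches = sum(
        1 for rating, avg in zip(teacher_ratings, class_averages)
        if rating == round(avg)
    )
    return matches, matches / len(teacher_ratings)


# Hypothetical data for 21 items, constructed so that 14 items agree.
teacher = [4] * 21
class_avg = [4.2] * 14 + [2.6] * 7

matches, rate = agreement_rate(teacher, class_avg)
print(f"{matches} of {len(teacher)} items agree ({rate:.1%})")
```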

Despite the general agreement, there were several items on the
questionnaire that produced substantial disagreement between teachers
and their students. These findings raised new questions and prompted
teachers to investigate the underlying reasons for the discrepancy. For
example, one teacher was surprised to find that on the average her students
felt classwork was "usually" too fast and difficult. Her first inter-
pretation was that she had overestimated her students' abilities. After
looking more closely at the distribution of responses, she saw that almost
as many students felt the work was "just right" as felt the work was 
"much too difficult." The second interpretation focused on the diversity 
of student ability in the classroom. To improve her teaching, she began 
to individualize instruction so that all of her students would be able to 
do some things well. 

Another interesting question produced general disagreement between the
intern teachers and their high school students: "How important to you is
having the teacher like you?" Secondary students rarely
reported that this was either "extremely" or "very" important. The
secondary interns seemed a little hurt and surprised by their students'
indifference. This finding generated a very fruitful discussion among
interns. It led to admissions that they were probably upset by this
student report because they wanted so much to be liked by their own students.
They had just assumed that liking was reciprocal. They confided to one 
another that wanting to be liked sometimes interfered with their better 
judgment as teachers. This conclusion was incorporated into their over- 
all plans for improvement. 

By comparison, elementary teachers were a little overwhelmed at their
students' rating of their teacher's importance in their lives. Almost all
elementary students said it was "extremely important" to be liked by their
teacher. Of course, these veteran teachers had suspected that their
students wanted their affection, but they had not known how strong or how
widespread this feeling was. Such unanimity in their students' responses 
made them sensitive to a number of related behaviors in the classroom. 
For example, after reviewing the questionnaire but prior to observation,
one teacher noted about another, "Those kids are always touching you, and
you never fail to respond." 

In addition to insights gained from students' responses on each item 
of the questionnaire, teachers discovered that examining the responses 
on several items at once sometimes revealed interesting patterns. For
example, one teacher discovered that her students reported being more
confused than she had suspected. They agreed that the teacher's directions
were unclear and that they were seldom encouraged to ask questions. She
felt that their confusion might be alleviated if she took measures to
clarify her directions and encouraged them to ask questions whenever they 
were confused. 

Although anonymity was ensured on the student questionnaire, teachers 
and interns spent a lot of time guessing which students had given certain 
responses. The elementary teachers, who knew their students much better 
than the interns, seemed confident of their ability to make these guesses. 
When one student responded that he "never received good grades" even when
he did "good work," the teacher said, "I know who that is, and he's right. 
We've got to start giving him some rewards for his efforts." The teacher 
was confident that this was the same student who responded that the teacher 
never let him know when he was doing "good work." 

The participants agreed that maintaining anonymity was important if 
they wanted honest responses from students, but one lamented that "it 
would be valuable to know a particular student whose answers were radically 
different. It may be that this student is having difficult problems that 
I have overlooked or that are not obvious to me, and I would want to give 
him the special help that might be needed." 

In the pilot test, one of the interns did a fine job of developing 
his own student questionnaire. He wanted to obtain specific information 
about his skills as a choir director. He learned that his conducting was 
"fairly easy to follow," but almost half of his students felt that he 
"stayed on one piece of music too long." Most of the choir liked the 
music "O.K.," with just a few liking it "a lot" or "not much." Only two 
students thought he looked like a "madman" when conducting. These items 
provided an excellent supplement to the more general student questionnaire.

Student questionnaires provide teachers with information they cannot
obtain elsewhere. Only students can tell a teacher whether or not they
are interested and comfortable in the classroom. The problems students
perceived were translated into specific criteria for the teacher's col-
league to observe and were discussed in the conferences. The student
assessment was a very valuable input that the teachers took into account
in assessing their strengths and weaknesses and making plans for
improvement.

Self-Assessment on Selected Criteria 

In addition to the teacher questionnaire, participants completed a
self-assessment form based on the criteria they had selected jointly with
their partners. After their teaching was observed, this self-assessment
could be compared with the observation form to help focus the conference
on areas for improvement. Overall, participants were usually much more
critical of themselves, both in ratings and in negative comments, than
their colleagues were. They generally agreed with their partners'
observations on areas of weakness, and most spent their conference in
swapping ideas for improvement rather than in resolving disagreements.

A colleague's agreement was helpful in legitimatizing a teacher's 
perception of her strengths and weaknesses. For example, one teacher
commented, "In discussion, I tend to rely on the same students who always 
have the answers, and I do not phrase open-ended questions to include 
everyone." When her colleague noted that "two boys spoke often, a few 
girls spoke occasionally, but no one else entered the discussion," her 
self-assessment was confirmed. A good part of their first conference 
focused on how she might increase student participation. In the second 
observation her colleague noted that "the discussion included more students 
and some who had not previously participated. You praised the newcomers.
Good."

In her self-assessment another teacher noted a need for "some improve- 
ment" in lectures because she "relied too heavily on note cards." During 
the first observation her colleague identified the same area: "The organiza- 
tion and sequence of the lesson is good, but you occasionally stopped to 
refer to notes." At the second observation the problem was not as severe
and the colleague observed, "You relied on notes much less."

Of course, not all of the problems were so easily remedied. In a
self-assessment one teacher reported the need "to project my voice." Her
colleague noted, "Teacher's quiet voice tends to trail off" on the first
observation form. In the second observation period the colleague reported,
"Teacher's voice does not carry above sound of the slide projector."
This is clearly a problem that needs to be addressed in that teacher's
improvement plan.

The Improvement Plan

Developing a plan for improvement is the most important step of the
collegial evaluation process. But the quality of each teacher's plan
depends on how well the other steps have been carried out. The plan for
improvement is formulated in a final "wrap-up" conference between the two
partners. Each teacher integrates all the information he or she has
received from self-assessment, student questionnaires, and peer evaluation,
and presents his partner with a composite list of strengths and weaknesses.
Together the teachers decide on the specific strategies each will use to 
improve their teaching performance in areas of weakness. In addition, 
they determine how they will evaluate the results of these strategies. 
Finally, they identify any resources they will need to carry out their 
improvement plan. 

In our pilot test of collegial evaluation, the improvement plans 
spanned the whole range of teaching activities: presentation of subject 
matter, classroom control, motivation, student interest and involvement, 
positive reinforcement, and classroom organization and atmosphere. The
improvement plans were based on evaluations that showed a remarkable
amount of agreement between the teachers themselves, their colleagues, 
and their students. In most cases a teaching weakness identified by one of 
these sources was corroborated by the others. 

For example, one teacher listed as an evaluation criterion, "Do not 
ignore any segment of the class concerning questions or needs — give attention 
equally." On the student questionnaire several students reported that 
they were "seldom" or "never" encouraged to ask questions in class. On the
basis of classroom observation the teacher's partner noted: "The less
capable students are not involved, especially those at the back." As
part of the plan for improvement, the teacher specified, "With the help of
my peer, I will first identify those students whom I have ignored. I
will make a point of talking to each of them every day. I'll keep a check
list to make sure I spend some time with each of these children." Another
teacher developed a plan to deal with a similar interaction problem in a
different way. To encourage the nonparticipators at the back, she decided 
to rearrange the class and move the pupils at the back into the first two 
rows. She also said that she would "give those individuals who have not 
been participating responsibility for explaining things to the class and 
helping others with their work." 

An intern chose as an evaluation criterion, "I present subject matter 
at a level appropriate to student ability." He was perplexed when most 
of his students reported on the questionnaires that they were confused by
his explanations. Then his peer commented, "You use a lot of terms which
go way over some of these kids' heads." In his improvement plan this
intern listed a number of specific strategies to overcome the problem.
Among these were: "I will try to define clearly all new terms which I
use in class and be more careful to write these terms and their definitions
on the board. I'll use pretests to determine pupil knowledge in the
subject area. For those who do well on these tests I will design self-
directed projects. This will leave me free to spend more time with the
slow achievers."

Some of the improvement plans called for relatively minor changes; 
others envisioned a major reorganization of the classroom and substantial 
changes in teacher behavior. Two of the elementary school teachers felt 
that they both needed to maintain a quieter learning environment. Such 
a concern is not atypical in open-space classrooms. After observing one 
another, they discovered that the noisiest time of the day came when they 
grouped their students by ability in math and language arts. The noise 
came from the "low ability" youngsters, and it prevented them and others 
from concentrating. As part of their improvement plan, the teachers 
decided that the next year they would experiment with more heterogeneous
groups.

Many of the identified weaknesses were not so difficult to remedy.
For example, one art teacher, concerned about giving appropriate positive
reinforcement for good work, benefited from his colleague's observation that
he did not have any student work displayed in the classroom. He planned to
"reserve a large space in the art room, school library, and hall display
cases for the exhibition of student work." Another intern, whose problem
was that he never had time to finish his lesson, decided to save a few
minutes each period by letting students distribute and collect classroom
materials rather than doing it himself. 

For each of the specific strategies, teachers were asked to determine 
how they would assess their progress. Plans for assessment were as varied
as improvement strategies. Teachers planning to improve their presentation
of subject matter often relied on student cognitive outcomes as a measure
of their success. The teacher mentioned above, who planned to explain and
define new terms more carefully, listed as one indicator of progress the
number of times students used the new terms in their essays.

Several teachers decided to use the student questionnaire as a post- 
test device to assess their improvement. Comparing the student response 
before and after the improvement plan was put into effect would help them
assess their progress in such areas as motivating students, evaluating them, 
presenting material clearly, individualizing subject matter, displaying 
interest in students, and developing material appropriate to the students' 
level. 
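
A before-and-after comparison of this kind might be tabulated along the
following lines. This is only a sketch: the item wording, the 1-to-5 coding
of classroom averages, and the numbers themselves are hypothetical, and a
change in an average is at best a rough indicator of improvement.

```python
def item_changes(before, after):
    """Per-item change in classroom average (after minus before).

    Only items present in both administrations are compared.
    """
    return {
        item: round(after[item] - before[item], 2)
        for item in before
        if item in after
    }


# Hypothetical classroom averages on an assumed 1-5 scale.
before = {"encouraged to ask questions": 2.8, "directions are clear": 3.1}
after = {"encouraged to ask questions": 3.9, "directions are clear": 3.0}

changes = item_changes(before, after)

# List the items with the largest gains first.
for item, delta in sorted(changes.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{item}: {delta:+.2f}")
```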

Almost all of the teachers planned to use collegial observation and
conferences as a method of assessing their improvement. Many had already
set up times to begin another round of observations with their colleagues.
Others decided to change partners. The specific strategies for improvement
would suggest new criteria for the next round of observations. One of the
most gratifying results of the pilot test was that many of the participants
considered our collegial evaluation program so useful that they planned to
extend it throughout the school year. As one teacher said,

I need to have this kind of collegial evaluation on
a regular basis. If my colleague evaluated me
throughout the year, she would have an under-
standing of the trends in my teaching and in a
particular class and the evaluation would be even
more helpful. She would be able to detect subtle
problem areas that I may not be aware of. I could
do the same for her and also continue to learn a
lot by observing another teacher at work.

Conclusions 

We began this discussion by criticizing traditional approaches to 
teacher evaluation and advocating collegial evaluation as an alternative. 
We summarized research revealing that teacher evaluation programs are all
weak in one or more steps of the evaluation process. According to teachers
and administrators we have interviewed, criteria for observation are usually
vague or unknown, observations are infrequent, useful feedback is rare, and
plans for teacher improvement are almost nonexistent. The experiences of
teachers in the pilot test of our collegial evaluation program gave us some
evidence for assessing this approach and comparing it with more traditional
methods of evaluating teachers. 

Most important, we learned that teachers can and will help each other 
perform better on their jobs. We also learned that teachers will take 
students' assessments of their teaching seriously and use them in
developing plans for improvement.

We found that the most difficult step of our program was selecting 
criteria to serve as a basis for evaluation. But most teachers did select 
some criteria that were specific, observable, and meaningful to them. We 
also learned that thinking about their criteria helped teachers assess not 
only where they might need to improve but what their goals as teachers were.

We emphasized that the steps of the evaluation program are interdepend-
ent and that a weakness in any one of them would diminish the program's
usefulness. This was especially apparent in reviewing improvement plans. 
If the criteria were specific, observable, and meaningful, if the observer 
was attentive and carefully reported observations to his or her colleague, 
and if the feedback exchanged was complete and honest, then the improve- 
ment plan generated by the pair of teachers was a thoughtful and practical 
blueprint for professional growth. The message is clear; teachers cannot 
participate in this program in a half-hearted manner. If they are to use
it as a means for improving their teaching, they must commit themselves to
doing a thorough and careful job at every step. 

Does collegial evaluation work? We believe the answer is yes. Based
on our pilot test we have concluded that collegial evaluation is a useful
approach to teacher evaluation in schools. On the whole, teachers reacted
favorably to collegial evaluation, adapted the program to fit their unique
circumstances, and gained new ideas for improving their teaching.